Abstract

The aim of the present study was to translate and assess the reliability
and validity of the Theory of Mind task battery designed by Hutchins &
Prelock (2010) for use in assessing theory of mind in
normally-developing children. Theory of mind has implications for many
aspects of children’s lives, such as social competence, peer acceptance,
and early success in school. A total of 367 normally-developing
monolingual children ranging in age from 3 to 12 years participated. A
back-translation method was used with the theory of mind task battery. A
subset of 49 children participated in a retest 10 to 14 days after the
initial assessment. Overall results indicated that the Arabic adaptation
of the test can be considered reliable for assessing theory of mind in
children. However, analyses of item difficulty indicated some
differences between the Arabic adaptation of the theory of mind task
battery and the English test. Therefore, the arrangement of the tasks
was changed: Task D (Line of sight) was moved to become the last task
tested because it was the most difficult for all ages.
Keywords: Arabic adaptation, Assessment, Children, Reliability, 
Theory of Mind, Validation. 


Introduction 

Theory of Mind (ToM) is considered a person’s ability to
understand his or her own mind and the minds of others, but the
term also includes a social-cognitive skill with implications for
many aspects of children’s functioning, such as social
competence, peer acceptance, and early success in school
(Carlson, Koenig & Harms, 2013). ToM also includes
understanding of various mental states, as well as the ways in
which action is shaped by such mental states and experiences,
both in straightforward situations and in complicated ones,
where the mind and action are at odds because of forgetting,
ignorance, false beliefs, accident, or error (Wellman, Fuxi, &
Peterson, 2011).


Two neural systems are involved in processing ToM: one for
perceiving others’ beliefs and intentions, which is the cognitive
component; and one for understanding others’ emotions, which
is the affective component (Poletti, Enrici, & Adenzato, 2012).
These components can be differentiated at the neural level,
with the parietal cortex involved during ToM processing;
yet ToM recruits a network of brain structures irrespective of
the differentiation between the affective and cognitive
subcomponents (Bodden et al., 2013).


According to Mahy, Moses, and Pfeifer (2014, p. 69), “ToM
development is driven by an innate neural mechanism
dedicated to mental state reasoning. Although experience may
be important in triggering this mechanism, it cannot revise the
mechanism's basic nature.” The development of ToM was
conceived as occurring in four phases. Phase 1 starts before
the age of two and a half years, when children begin to develop
some sort of nonrepresentational understanding of perception
as well as social referencing and object permanence abilities.
Phase 2 occurs between two and a half and 3 years. In this
phase, children show some level of understanding of
nonrepresentational states such as perception and desire, and
can solve simple desire tasks. For example, children are told
about a character who has a choice of either a cookie or a
carrot for a snack, and are asked which of these snacks they
would choose for themselves. If they choose the cookie, they
are then told that the character likes the carrot, and are asked
to predict which snack the character would choose. Children
generally pass this phase at an early age. Phase 3 starts at the
age of 3 years, when children can understand representational
aspects of desire and their perception improves; at this age
they can also invoke preliminary accounts of misrepresentation
if they are confronted by direct counter-evidence. In an example
task, children are told about a character who has a choice of
looking for her cat under the porch or in the garden, and are
asked where they themselves would look for the cat. If the child
thinks the cat is in the garden, they are told that the character
in the story thinks that the cat is under the porch. The children
are then asked to predict which of the locations the story
character would choose to look for the cat (Liu, Meltzoff, &
Wellman, 2009; Gopnik, Slaughter, & Meltzoff, 2014).
Finally, phase 4 begins around 4 years, when children can
generalize the notion of representation from perceptual context
and develop a general predictive and applicable notion of false
belief. Such understanding is usually measured using standard
tasks such as the Change in Location or Unexpected Contents
tasks. For example, children are shown a crayon box that
contains candles and are asked what is in the box. They are then
asked what another person would think is in the crayon box.
Three-year-olds would say that the other person would think
that the box contains candles, not crayons, while four- and
five-year-olds would say the other person would think it has
crayons inside (Atance, Bernstein, & Meltzoff, 2010; Gopnik,
Slaughter, & Meltzoff, 2014).


ToM starts showing in the behavior of children as young as
15 months old, when they are able to imitate behaviors they see
and can re-enact the goals and intentions of a person based on
unsuccessful acts. For example, if a child sees that someone
is trying to push a buzzer but fails to do so, an 18-month-old
child would be able to imitate that person and push the buzzer
even though the person failed to (Meltzoff, 1995). By the age of 2
years, children usually adopt a fundamental aspect of ToM
regarding people (but not inanimate objects), as evidenced by
their ability to understand some goals and intentions of others
(Meltzoff, 1999). By the age of 3 years, children show some
competency in dealing with false belief tasks, and by the age of
4 years, their performance on such tasks improves (Leslie,
1994). Yet, 3- and 4-year-old children's explanations imply a
conviction that belief and appearances always match reality,
and that there is only one perspective. By the age of 5 years,
they can understand the fact that belief and reality are not
always the same (Carlson & Moses, 2001).


Children 6 years and older are able to represent wrong 
beliefs and to construct a deceitful or truthful utterance relative 
to a person’s wrong beliefs. During this period, several other 
related abilities emerge, such as understanding the absence of 
knowledge in other people’s thinking, constructing deceitful 
utterances, and recognizing deceptive plans from critical
utterances in the context of conflicting goals (Wimmer & Perner,
1983). ToM plays an important role in child development and is 
clearly evident in many activities such as social interaction with 
peers, adults and younger children; engagement in pretense; 
and participation in group games such as hide and seek 
(Wellman & Peterson, 2013). 


The development of ToM interrelates with different cognitive 
abilities such as executive functions. In particular, it has been 
argued that the development of these functions usually requires 
conceptual change, which enables specific conceptual 
orientations to develop that are essential to the development of 
ToM (Wellman, Cross & Watson, 2001). Language also has an 
important role in children’s conception of mind (Hale & Tager- 
Flusberg, 2003). A meta-analysis indicated that children’s 
semantics, general language, syntax, and memory for 
complements are all strongly linked to children’s performance 
on false-belief tasks (Milligan, Astington & Dack, 2007). 


Number of siblings also has an effect on ToM: children from 
larger families perform better on ToM tasks than do children 
from smaller families, which suggests that the interaction 
between siblings and/or caregivers and children has a useful 
effect on the understanding of false belief (Perner, Ruffman & 
Leekam, 1994). ToM is also affected by culture: for example, 
children in the United Kingdom and those attending 
international schools in Hong Kong perform better on ToM tasks 
than do children attending local schools in Hong Kong (Wang, 
Devine, Wong & Hughes, in press). 


ToM is correlated with certain aspects of human life; for 
example, children with autism fail to employ ToM in social 
situations, and this is clear in the behavior of these children when
they are unable to impute beliefs to others; therefore, they are 
at a serious disadvantage when having to predict the behavior 
of others (Baron-Cohen, Leslie & Frith, 1985). They are also 
unable to manipulate others in simple situations (Sodian & Frith, 
1992). Patients with Huntington's disease also exhibit deficits 
in ToM (Brune, Blank, Witthaus & Saft, 2011). Overall, people
with ToM deficits tend to show negative and disorganization 
syndromes (Urbach, Brunet-Gouet, Bazin, Hardy-Baylé & 
Passerieux, 2013). 


ToM is measured by different tasks in different experimental 
settings. When the participant is required to manipulate other 
people’s knowledge, a sabotage and deception condition is 
used. In this task, each condition includes a cooperative and a 
deception trial, and at the end of each trial the participant is 
asked to explain why a doll deceived or physically hindered the 
competitor and why she helped the cooperator (Sodian & Frith, 
1992). Another commonly used ToM test is the "Smarties 
task," which uses the deceptive-appearance paradigm. In this 
task, children are shown a Smarties candy box (or any familiar 
candy box) and asked about its contents. When children reply 
"Smarties," the experimenter opens the box and reveals its 
contents which could be buttons, pencils or any other objects 
except Smarties. Then the experimenter asks the children to 
predict another person's response to the original question, 
"What is in this box?" (Perner, Frith, Leslie & Leekam, 1989). 


False-belief tasks are also used widely in ToM research, 
particularly for investigating the development of ToM. The 
tasks require drawing conclusions about an action or thinking of 
someone whose beliefs conflict with reality and with the 
participant’s own current knowledge (Wellman et al., 2011), and
a prediction of what the other person would do. This
methodology is used with children as young as 18 months
(Southgate, 2013). When the research aims to investigate
neural correlates of ToM, it employs neuroimaging methods.
Such studies have indicated that ToM relies on a specific set of
brain regions commonly known as the ToM network
(Schaafsma, Pfaff, Spunt & Adolphs, 2015).


Among previous studies of the psychometric properties of ToM
tasks, Mayes et al. (1996) examined the test-retest reliability of
data from 23 children aged 36-71 months (mean age 49.6
months). The children watched three videotaped stories
illustrating a false-belief situation. After each situation the
experimenter narrated false-belief tasks, and the children were
asked to answer questions related to the story. The interval
between test and retest was 2 to 3 weeks. The test-retest
reliability for the false-belief questions was poor to moderate
(Mayes et al., 1996). Muris et al. (1999) examined a ToM task
that comprised interviews appropriate for children between 5
and 12 years of age. The test consists of pictures, stories, and
drawings about which the child has to answer a number of
questions, and it contains three subscales. Seventy
normally-developing children participated, and 12 of them were
retested with an 8-week interval between test and retest.
Results indicated sufficient test-retest stability.


Hutchins, Prelock & Chace (2008) evaluated a ToM task 
battery that comprised 16 test questions within nine tasks and 
different complexity levels. The test was administered twice to 
17 children diagnosed with Autism Spectrum Disorder. Test- 
retest reliability was adequate, and internal consistency was 
high. Similarly, Devine & Hughes (in press) examined the 
psychometric properties of a task battery composed of items 
from Happé's Strange Stories task and Devine and Hughes' 
Silent Film task. A total of 460 ethnically and socially diverse
children between 7 and 13 years old participated by completing the task
battery at two time points separated by 1 month. The ToM test
exhibited strong test-retest reliability. Clemmensen et al.
(2016) studied aspects of validity and reliability of the Danish
version of the ToM-storybook Frederik as a measure of ToM 
deficits. Their results support the validity of this tool because it 
was able to identify expected ToM deficits at the group level. 
The test-retest reliability estimate of the ToM-Frederik Total 
score was also good; therefore, overall, their findings support 
the validity of the Danish version of the ToM Storybook Frederik 
as a measure of ToM. 


The purpose of the present study is to translate and assess
the reliability and validity of the ToM task battery designed by
Hutchins & Prelock (2010), for use in assessing ToM in
normally-developing children. To our knowledge, ToM has not
been studied in depth in Saudi Arabia, nor in the Arab world in
general. We expected the Arabic version of the ToM task
battery to be as valid and reliable as the original version.


Method 


Participants 
A total of 367 normally-developing children (323 girls, 44
boys), ranging in age between 3 and 12 years (M = 8.4),
participated. Participants were recruited from 12 schools
covering the four main educational sectors (north, south, east, 
and west) in Riyadh, Saudi Arabia. Most (84%) were Saudi 
children; the rest represented other Arab nationalities (Egypt, 
Sudan, Syria, Yemen, Palestine, and Jordan). All participants 
were monolingual Arabic speakers, grew up in Saudi Arabia 
and were studying in Saudi public schools. 


Measures 

ToM task battery. The ToM task battery is designed to 
assess a range of content and complexity levels across social 
and cognitive domains (Hutchins, Bonazinga, Prelock, & Taylor, 
2008). It comprises 15 test questions within nine tasks, and has
different complexity levels. 


Task A (Emotion recognition) aims to test the ability to 
identify emotions associated with four different facial
expressions (happy, sad, mad, and scared). In Task B (Desire- 
based emotion), children’s understanding of desire is assessed 
by asking them to infer an emotion based on a character's 
desire. Task C (Seeing leads to knowing) measures more 
advanced abilities involving the inference of belief-based 
emotion, reality-based emotion, and second-order belief-based 
emotion. Task D (Line of sight) targets the ability to infer a 
perception-based belief, i.e., the ability to understand that 
people in different positions may see different things. Task E 
(Perception-based action) uses the classic false-belief change- 
location task to test the ability to understand that perception 
influences behavior. Task F (Standard false-belief task) 
assesses the ability to infer a desire-based belief in the context
of an unexpected change of location. Task G (Belief- and 
reality-based emotion and second-order emotion task) 
evaluates the understanding that beliefs, along with events 
contrary to beliefs, can cause emotion. Task H (Message- 
desire discrepant task) assesses the ability to infer the belief of 
someone else when interpreting a statement of desire in the 
context of a change in location. Task I (Second-order
false-belief task) involves thinking about what someone else thinks
about what someone else thinks, and it includes the element of 
a false belief (Hutchins & Prelock, 2010). 


Procedures 

The approval of the ToM task battery’s authors was secured
to translate and standardize the test on an Arabic population,
after which a back-translation was conducted by Arabic-English
bilingual academics at King Saud University.


The names of characters, and certain situations as well, were
changed to suit most Arab cultures; for example, “birthday”
was changed to Eid al-Fitr (the festival celebrated after the holy
month of Ramadan), because not all people in Saudi Arabia
celebrate birthdays. The approval of the Ministry of Education
in Saudi Arabia was also obtained to conduct this research in
Saudi schools and to recruit the child participants. Children
were recruited from 12 schools (8 elementary and 4
preschools). The test, which takes 15 to 25 minutes to
complete, was administered in a familiar, quiet room in the
school. The researcher first told the child, “I am going to read
you some short stories, and then I will ask you some questions
about these stories. You can answer me by pointing to the
pictures or by using words.” As soon as the child pointed at or
verbally expressed an answer, the researcher entered the
child’s response on the answer form. To calculate test-retest
reliability, 49 participants were retested on the ToM battery with
an interval of 10 to 14 days.
Results
Item-total correlations of the Arabic version of the ToM task 
battery are shown in Table 1. 


Table 1 Item-total correlations for the Arabic version of the ToM
task battery

[Table body not recoverable from the source. Columns: item number
and item-total correlation coefficient.]

Table 1 shows that all 15 ToM questions are valid and that 
all correlation coefficients are significant at p < .001. 
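
The item-total analysis reported above can be illustrated with a minimal sketch. It assumes each item is scored 0 or 1 and that each item is correlated with the total score including that item (the uncorrected form); the source does not state which variant was used. The function names and the demo data are hypothetical and are not the study data.

```python
from statistics import mean


def pearson_r(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def item_total_correlations(scores):
    """Correlate each item with the total score.

    scores: list of rows, one per child; each row holds 0/1 item scores.
    Assumption: the total includes the item itself (uncorrected form).
    """
    totals = [sum(row) for row in scores]
    n_items = len(scores[0])
    return [
        pearson_r([row[i] for row in scores], totals)
        for i in range(n_items)
    ]


# Hypothetical responses from five children on three items.
demo = [
    [1, 1, 1],
    [1, 1, 0],
    [1, 0, 0],
    [0, 0, 0],
    [1, 1, 1],
]
correlations = item_total_correlations(demo)
```

An item whose correlation with the total is low or negative would be a candidate for review, since it does not track overall performance on the battery.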





Test-retest reliability results for the first (T1) and the second 
(T2) testing are shown in Table 2. 


Table 2 Test-retest correlations

ToM | n | Cronbach’s alpha for first testing (T1) | Cronbach’s alpha for second testing (T2) | SRA (T1)
[Row only partly legible in the source; values as extracted: 0.728 | ose | 692 | 0.687]





As Table 2 indicates, internal consistency at (T1) and (T2)
was moderate to high, as was the correlation between the
two (p = .001). This result indicates that the test is reliable and
can be used for normally-developing children 3 to 12 years of
age.
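
As a rough illustration of the two reliability statistics discussed above, the sketch below computes Cronbach's alpha from an item-score matrix and a test-retest correlation from two sets of total scores. It assumes the test-retest coefficient is a Pearson correlation, which the source does not confirm. All names and numbers are hypothetical and are not the study data.

```python
from statistics import mean, variance


def cronbach_alpha(scores):
    """Cronbach's alpha: k/(k-1) * (1 - sum(item variances) / total variance).

    scores: list of rows, one per child; each row holds the item scores.
    """
    k = len(scores[0])
    item_vars = [variance([row[i] for row in scores]) for i in range(k)]
    total_var = variance([sum(row) for row in scores])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


def pearson_r(x, y):
    """Pearson correlation, used here as the assumed test-retest coefficient."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical 0/1 item scores for five children on three items.
demo_items = [
    [1, 1, 1],
    [1, 1, 0],
    [1, 0, 0],
    [0, 0, 0],
    [1, 1, 1],
]
alpha = cronbach_alpha(demo_items)

# Hypothetical total scores for the same five children at T1 and T2.
t1_totals = [10, 12, 8, 14, 11]
t2_totals = [11, 12, 9, 13, 12]
retest_r = pearson_r(t1_totals, t2_totals)
```

Alpha would be computed separately for the T1 and T2 administrations, while the test-retest coefficient pairs each retested child's T1 total with that child's T2 total.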


Descriptive and Normative Data 
Item difficulty. Each item in the ToM task battery was
tested for its difficulty (shown in Table 3). This analysis is
necessary to ensure that both easy and difficult items are 
included, and, insofar as possible, to order the tasks from 
easiest to most difficult. 
Table 3 Difficulty index of all items, by task, for the ToM task
battery


Item number | % who answered correctly





Table 3 shows that the most difficult task in the ToM battery 
was Task D, which assesses the ability to infer perception- 
based beliefs (Line of sight task). In this task, only 21.8% of 
participants answered item 7 correctly and only 15.8% 
answered item 8 correctly, while all other items were answered 
correctly by more than 50% of the children. Hence, the line of 
sight task will be moved to the end of the battery in the Arabic 
adaptation of the test. 
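
The difficulty index used here is the percentage of children who answered an item correctly. A minimal sketch of that calculation, and of ordering items from easiest to most difficult, follows. The function names and demo responses are hypothetical and are not the study data.

```python
def difficulty_index(scores):
    """Percentage of children answering each item correctly.

    scores: list of rows, one per child; each row holds 0/1 item scores.
    """
    n_children = len(scores)
    n_items = len(scores[0])
    return [
        100 * sum(row[i] for row in scores) / n_children
        for i in range(n_items)
    ]


def order_easiest_first(percent_correct):
    """Item indices sorted from highest to lowest percentage correct."""
    return sorted(
        range(len(percent_correct)),
        key=lambda i: percent_correct[i],
        reverse=True,
    )


# Hypothetical responses from five children on three items.
demo = [
    [1, 0, 1],
    [1, 0, 1],
    [1, 1, 0],
    [0, 0, 1],
    [1, 0, 1],
]
percent_correct = difficulty_index(demo)
item_order = order_easiest_first(percent_correct)
```

In this toy data the second item has the lowest percentage correct, so it is placed last, which mirrors the reasoning for moving the Line of sight task to the end of the battery.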


Mean and standard deviation of total score by age. Data 
on the mean total score by age can be used as a general 
benchmark of expected performance (Hutchins & Prelock, 
2010). The means and standard deviations of the total score by 
age for the Arabic version are shown in Figure 1. 
Figure 1 Means and Standard Deviations of total score, by age

As Figure 1 indicates, ToM develops and progresses over
time, with mean scores generally improving year upon year.
The mean standard deviation for this
adaptation of the test was SD = 2.00.
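
A by-age benchmark of this kind can be produced by grouping total scores by age and taking the mean and standard deviation within each group. The sketch below assumes the sample standard deviation is used and that ages are recorded in whole years; the source specifies neither. The function name and the records are hypothetical and are not the study data.

```python
from collections import defaultdict
from statistics import mean, stdev


def score_summary_by_age(records):
    """Return {age: (mean total score, sample SD)} from (age, total) pairs."""
    by_age = defaultdict(list)
    for age, total in records:
        by_age[age].append(total)
    return {
        age: (mean(totals), stdev(totals))
        for age, totals in sorted(by_age.items())
    }


# Hypothetical (age in years, total score) records.
demo = [(3, 5), (3, 7), (4, 8), (4, 10), (5, 11), (5, 13)]
summary = score_summary_by_age(demo)
```

Each age group needs at least two children for the sample standard deviation to be defined.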


Discussion 

The aim of this study was to assess the reliability and validity
of the ToM task battery designed by Hutchins & Prelock (2010)
by measuring test-retest reliability and internal consistency after
the test was translated and adapted to fit the Saudi and Arab
culture. Internal consistency of the adapted version was
moderate to high, and the Cronbach’s alpha of .70 indicated
acceptable reliability (Tavakol & Dennick, 2011). Taken
together, the results support the use of this version for 
assessing ToM with normally-developing children in Saudi 
Arabia. 


Results also show that ToM develops gradually over the 
years, which is in line with previous studies that show steady 
improvement and the emergence of a number of abilities by the 
age of 6 years that make ToM stronger than at earlier ages 
(Wimmer & Perner, 1983; Leslie, 1994; Gopnik et al., 2014).


Analyses of item difficulty indicated some differences 
between the Arabic adaptation of the ToM task battery and the 
English test. Therefore, in comparison with the original test by 
Hutchins & Prelock (2010), our arrangement of the tasks will be
changed to place Task D (Line of sight) as the final task
because this task was the most difficult for all ages. Henceforth,
the Arabic adaptation of the ToM test will be as follows: Task A:
Emotion recognition, Task B: Desire-based emotion, Task C:
Seeing leads to knowing, Task D: Perception-based action,
Task E: Standard false-belief task, Task F: Belief- and
reality-based emotion and second-order emotion task, Task G:
Message-desire discrepant task, Task H: Second-order
false-belief task, and the final task, Task I: Line of sight.


Limitations of this study include the small number of 
participants, particularly in the younger ages. Hence, the use of 
a larger number of participants and inclusion of greater socio- 
cultural and geographic diversity in the country are advised for 
future research. ToM work would also be advanced by studies 
of this instrument elsewhere in the Arab world, and with children 
affected by conditions such as autism and deafness. Despite 
this limitation, the Arabic adaptation of the test overall can be 
considered a reliable instrument for assessing ToM with 
normally-developing children. 